 symbolic program


Gradient-Based Program Repair: Fixing Bugs in Continuous Program Spaces

Silva, André, Thorén, Gustav, Monperrus, Martin

arXiv.org Artificial Intelligence

Automatic program repair seeks to generate correct code from buggy programs, with most approaches searching for the correct program in a discrete, symbolic space of source code tokens. This symbolic search is fundamentally limited by its inability to directly reason about program behavior. We introduce Gradient-Based Program Repair (GBPR), a new paradigm that reframes program repair as continuous optimization in a differentiable numerical program space. Our core insight is to compile symbolic programs into differentiable numerical representations, enabling search in the numerical program space directly guided by program behavior. To evaluate GBPR, we present RaspBugs, a new benchmark of 1,466 buggy symbolic RASP programs and their respective numerical representations. Our experiments demonstrate that GBPR can effectively repair buggy symbolic programs by gradient-based optimization in the numerical program space, with convincing repair trajectories. To our knowledge, we are the first to state program repair as continuous optimization in a numerical program space. Our work establishes a new direction for program repair research, bridging two rich worlds: continuous optimization and program behavior.
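As a rough illustration of the idea of repairing a program by gradient descent on its behavior, the toy sketch below treats the constants of a tiny "program" as continuous parameters and adjusts them until the program matches a set of input-output examples. This is an analogy only, not the GBPR method or the RaspBugs setup: the linear program, the names `run` and `repair`, the learning rate, and the examples are all assumptions made for illustration.

```python
# Toy analogy (NOT the paper's method): a "program" y = w * x + b whose
# constants are treated as continuous parameters and repaired by gradient
# descent on the squared error over input-output examples.

def run(params, x):
    """Execute the toy numerical program on input x."""
    w, b = params
    return w * x + b


def repair(params, examples, lr=0.01, steps=2000):
    """Adjust the parameters so that program behavior matches the examples."""
    w, b = params
    n = len(examples)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in examples) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in examples) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Intended behavior (hypothetical specification): y = 2 * x + 1.
examples = [(x, 2 * x + 1) for x in range(-3, 4)]
buggy = (3.0, 0.0)  # hypothetical buggy constants
fixed = repair(buggy, examples)
```

Here the search is driven entirely by the observed outputs rather than by edits to source tokens, which is the contrast the abstract draws with symbolic search.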


Reinforcement Learning with Physics-Informed Symbolic Program Priors for Zero-Shot Wireless Indoor Navigation

Li, Tao, Lei, Haozhe, Yin, Mingsheng, Hu, Yaqi

arXiv.org Artificial Intelligence

When using reinforcement learning (RL) to tackle physical control tasks, inductive biases that encode physics priors can help improve sample efficiency during training and enhance generalization in testing. However, the current practice of incorporating these helpful physics-informed inductive biases inevitably requires significant manual labor and domain expertise, making it prohibitive for general users. This work explores a symbolic approach to distill physics-informed inductive biases into RL agents, where the physics priors are expressed in a domain-specific language (DSL) that is human-readable and naturally explainable. Yet, the DSL priors do not translate directly into an implementable policy due to partial and noisy observations and additional physical constraints in navigation tasks. To address this gap, we develop a physics-informed program-guided RL (PiPRL) framework with applications to indoor navigation. PiPRL adopts a hierarchical and modularized neuro-symbolic integration, where a meta symbolic program receives semantically meaningful features from a neural perception module, which form the bases for symbolic programming that encodes physics priors and guides the RL process of a low-level neural controller. Extensive experiments demonstrate that PiPRL consistently outperforms purely symbolic or neural policies and reduces training time by over 26% with the help of the program-based inductive biases.


How Do Transformers Learn Variable Binding in Symbolic Programs?

Wu, Yiwei, Geiger, Atticus, Millière, Raphaël

arXiv.org Artificial Intelligence

Variable binding -- the ability to associate variables with values -- is fundamental to symbolic computation and cognition. Although classical architectures typically implement variable binding via addressable memory, it is not well understood how modern neural networks lacking built-in binding operations may acquire this capacity. We investigate this by training a Transformer to dereference queried variables in symbolic programs where variables are assigned either numerical constants or other variables. Each program requires following chains of variable assignments up to four steps deep to find the queried value, and also contains irrelevant chains of assignments acting as distractors. Our analysis reveals a developmental trajectory with three distinct phases during training: (1) random prediction of numerical constants, (2) a shallow heuristic prioritizing early variable assignments, and (3) the emergence of a systematic mechanism for dereferencing assignment chains. Using causal interventions, we find that the model learns to exploit the residual stream as an addressable memory space, with specialized attention heads routing information across token positions. This mechanism allows the model to dynamically track variable bindings across layers, resulting in accurate dereferencing. Our results show how Transformer models can learn to implement systematic variable binding without explicit architectural support, bridging connectionist and symbolic approaches. To facilitate reproducible research, we developed Variable Scope, an interactive web platform for exploring our findings at https://variablescope.org
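To make the dereferencing task concrete, here is a minimal sketch of the symbolic computation the model is asked to reproduce: following a chain of assignments until a numerical constant is reached, while ignoring distractor chains. The `name = value` program syntax, the single-assignment-per-variable simplification, and the function name `dereference` are assumptions for illustration, not the paper's exact data format.

```python
# Minimal sketch of the task, assuming one assignment per variable and a
# hypothetical "name = value" line syntax.

def dereference(program, query, max_depth=4):
    """Follow variable assignments from `query` until a constant is found."""
    env = {}
    for line in program:
        name, value = (part.strip() for part in line.split("="))
        env[name] = value

    current = query
    for _ in range(max_depth + 1):
        value = env[current]
        if value.isdigit():
            return int(value)  # reached a numerical constant
        current = value  # value is another variable: keep following
    raise ValueError("assignment chain is deeper than max_depth")


# The chain d -> c -> b -> a -> 7 is relevant to the query "d";
# the chain y -> x -> 3 acts as a distractor.
program = ["a = 7", "b = a", "x = 3", "c = b", "y = x", "d = c"]
result = dereference(program, "d")
```

A classical implementation like this one relies on an addressable dictionary; the abstract's finding is that the Transformer learns to use its residual stream in a comparable role.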


Dolphin: A Programmable Framework for Scalable Neurosymbolic Learning

Naik, Aaditya, Liu, Jason, Wang, Claire, Dutta, Saikat, Naik, Mayur, Wong, Eric

arXiv.org Artificial Intelligence

Neurosymbolic learning has emerged as a promising paradigm to incorporate symbolic reasoning into deep learning models. However, existing frameworks are limited in scalability with respect to both the training data and the complexity of symbolic programs. We propose Dolphin, a framework to scale neurosymbolic learning at a fundamental level by mapping both forward chaining and backward gradient propagation in symbolic programs to vectorized computations. For this purpose, Dolphin introduces a set of abstractions and primitives built directly on top of a high-performance deep learning framework like PyTorch, effectively enabling symbolic programs to be written as PyTorch modules. It thereby enables neurosymbolic programs to be written in a language like Python that is familiar to developers and compiled to computation graphs that are amenable to end-to-end differentiation on GPUs. We evaluate Dolphin on a suite of 13 benchmarks across 5 neurosymbolic tasks that combine deep learning models for text, image, or video processing with symbolic programs that involve multi-hop reasoning, recursion, and even black-box functions like Python eval(). Dolphin takes only 0.33%-37.17% of the time (2.77% on average) to train these models on the largest input per task compared to baselines Scallop, ISED, and IndeCateR+, which time out on most of these inputs. Models written in Dolphin also achieve state-of-the-art accuracies even on the largest benchmarks.


Closed Loop Interactive Embodied Reasoning for Robot Manipulation

Nazarczuk, Michal, Behrens, Jan Kristof, Stepanova, Karla, Hoffmann, Matej, Mikolajczyk, Krystian

arXiv.org Artificial Intelligence

Embodied reasoning systems integrate robotic hardware and cognitive processes to perform complex tasks, typically in response to a natural language query about a specific physical environment. This usually involves changing the belief about the scene or physically interacting with and changing the scene (e.g. 'Sort the objects from lightest to heaviest'). To facilitate the development of such systems, we introduce a new simulation environment that uses the MuJoCo physics engine and the high-quality renderer Blender to provide realistic visual observations that are also accurate to the physical state of the scene. Together with the simulator, we propose a new benchmark composed of 10 classes of multi-step reasoning scenarios that require simultaneous visual and physical measurements. Finally, we develop a new modular Closed Loop Interactive Reasoning (CLIER) approach that takes into account the measurements of non-visual object properties, changes in the scene caused by external disturbances, and uncertain outcomes of robotic actions. We extensively evaluate our reasoning approach in simulation and in real-world manipulation tasks, with success rates above 76% and 64%, respectively.


Symbolic Synthesis of Neural Networks

Whitehouse, Eli

arXiv.org Artificial Intelligence

Neural networks adapt very well to distributed and continuous representations, but struggle to generalize from small amounts of data. Symbolic systems commonly achieve data efficient generalization by exploiting modularity to benefit from local and discrete features of a representation. These features allow symbolic programs to be improved one module at a time and to experience combinatorial growth in the values they can successfully process. However, it is difficult to design a component that can be used to form symbolic abstractions and which is adequately overparametrized to learn arbitrary high-dimensional transformations. I present Graph-based Symbolically Synthesized Neural Networks (G-SSNNs), a class of neural modules that operate on representations modified with synthesized symbolic programs to include a fixed set of local and discrete features. I demonstrate that the choice of injected features within a G-SSNN module modulates the data efficiency and generalization of baseline neural models, creating predictable patterns of both heightened and curtailed generalization. By training G-SSNNs, we also derive information about desirable semantics of symbolic programs without manual engineering. This information is compact and amenable to abstraction, but can also be flexibly recontextualized for other high-dimensional settings. In future work, I will investigate data efficient generalization and the transferability of learned symbolic representations in more complex G-SSNN designs based on more complex classes of symbolic programs. Experimental code and data are available at https://github.com/shlomenu/symbolically_synthesized_networks .


Learning Neuro-symbolic Programs for Language Guided Robot Manipulation

Kalithasan, Namasivayam, Singh, Himanshu, Bindal, Vishal, Tuli, Arnav, Agrawal, Vishwajeet, Jain, Rahul, Singla, Parag, Paul, Rohan

arXiv.org Artificial Intelligence

Given a natural language instruction and an input scene, our goal is to train a model to output a manipulation program that can be executed by the robot. Prior approaches for this task have one of the following limitations: (i) they rely on hand-coded symbols for concepts, limiting generalization beyond those seen during training [1]; (ii) they infer action sequences from instructions but require dense sub-goal supervision [2]; or (iii) they lack the semantics required for the deeper object-centric reasoning inherent in interpreting complex instructions [3]. In contrast, our approach can handle linguistic as well as perceptual variations, is end-to-end trainable, and requires no intermediate supervision. The proposed model uses symbolic reasoning constructs that operate on a latent neural object-centric representation, allowing for deeper reasoning over the input scene. Central to our approach is a modular structure consisting of a hierarchical instruction parser and an action simulator to learn disentangled action representations. Our experiments in a simulated environment with a 7-DOF manipulator, consisting of instructions with varying numbers of steps and scenes with different numbers of objects, demonstrate that our model is robust to such variations and significantly outperforms baselines, particularly in the generalization settings. The code, dataset and experiment videos are available at https://nsrmp.github.io


AI's next big leap

#artificialintelligence

A few years ago, scientists learned something remarkable about mallard ducklings. If one of the first things the ducklings see after birth is two objects that are similar, the ducklings will later follow new pairs of objects that are similar, too. Hatchlings shown two red spheres at birth will later show a preference for two spheres of the same color, even if they are blue, over two spheres that are each a different color. Somehow, the ducklings pick up and imprint on the idea of similarity, in this case the color of the objects. What the ducklings do so effortlessly turns out to be very hard for artificial intelligence. This is especially true of a branch of AI known as deep learning or deep neural networks, the technology powering the AI that defeated the world's Go champion Lee Sedol in 2016. Such deep nets can struggle to figure out simple abstract relations between objects and reason about them unless they study tens or even hundreds of thousands of examples.


Have You Heard of Neurosymbolic AI? - The Wire Science

#artificialintelligence

A few years ago, scientists learned something remarkable about mallard ducklings. If one of the first things the ducklings see after birth is two objects that are similar, the ducklings will later follow new pairs of objects that are similar, too. Hatchlings shown two red spheres at birth will later show a preference for two spheres of the same colour, even if they are blue, over two spheres that are each a different colour. Somehow, the ducklings pick up and imprint on the idea of similarity, in this case the colour of the objects. What the ducklings do so effortlessly turns out to be very hard for artificial intelligence. This is especially true of a branch of AI known as deep learning or deep neural networks, the technology powering the AI that defeated the world's Go champion Lee Sedol in 2016.


Teaching machines to reason about what they see

#artificialintelligence

A child who has never seen a pink elephant can still describe one -- unlike a computer. "The computer learns from data," says Jiajun Wu, a PhD student at MIT. "The ability to generalize and recognize something you've never seen before -- a pink elephant -- is very hard for machines." Deep learning systems interpret the world by picking out statistical patterns in data. This form of machine learning is now everywhere, automatically tagging friends on Facebook, narrating Alexa's latest weather forecast, and delivering fun facts via Google search. But statistical learning has its limits.